Overview

Dataset statistics

Number of variables: 21
Number of observations: 584524
Missing cells: 137396
Missing cells (%): 1.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 93.7 MiB
Average record size in memory: 168.0 B
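
The percentage and per-record figures above can be cross-checked from the raw counts. The sketch below assumes that "Missing cells (%)" is missing cells divided by variables times observations, and that 1 MiB is 1024**2 bytes; the variable names are illustrative only.

```python
# Figures copied from the Dataset statistics table.
n_variables = 21
n_observations = 584_524
missing_cells = 137_396
total_size_mib = 93.7

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: MiB means 1024**2 bytes; 93.7 is itself rounded,
# so the result only approximates the reported 168.0 B.
avg_record_bytes = total_size_mib * 1024**2 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.0f} B")
```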

Variable types

Numeric: 7
Categorical: 7
DateTime: 3
Text: 3
Unsupported: 1

Alerts

High correlation: item_id is highly overall correlated with Customer ID and 3 other fields
High correlation: price is highly overall correlated with grand_total
High correlation: grand_total is highly overall correlated with price
High correlation: Month is highly overall correlated with M-Y
High correlation: Customer ID is highly overall correlated with item_id and 3 other fields
High correlation: status is highly overall correlated with BI Status
High correlation: BI Status is highly overall correlated with status
High correlation: Year is highly overall correlated with item_id and 3 other fields
High correlation: M-Y is highly overall correlated with item_id and 4 other fields
High correlation: FY is highly overall correlated with item_id and 3 other fields
Imbalance: status is highly imbalanced (51.7%)
Missing: sales_commission_code has 137175 (23.5%) missing values
Skewed: qty_ordered is highly skewed (γ1 = 184.7792386)
Skewed: grand_total is highly skewed (γ1 = 254.9982139)
Unique: item_id has unique values
Unsupported: increment_id is an unsupported type, check if it needs cleaning or further analysis
Zeros: grand_total has 9632 (1.6%) zeros
Zeros: discount_amount has 376306 (64.4%) zeros
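
The "Skewed" alerts report a skewness coefficient γ1. As a rough illustration of what that number measures, the sketch below computes the population (Fisher-Pearson) skewness, m3 / m2**1.5, on made-up data. The report may use a bias-corrected sample estimator instead, so this is not claimed to reproduce its exact values.

```python
from statistics import fmean


def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (undefined for constant data)."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical data, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
long_tail = [1] * 98 + [2, 1000]  # mostly 1s with one extreme order size

print(skewness(symmetric))  # symmetric data: no skew
print(skewness(long_tail))  # one large outlier: strongly positive skew
```

A column such as qty_ordered, where most rows equal 1 and a handful reach 1000, behaves like the second sample: a few extreme values dominate the third moment.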

Reproduction

Analysis started: 2023-09-13 12:50:52.871403
Analysis finished: 2023-09-13 12:52:51.750405
Duration: 1 minute and 58.88 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
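
The reported duration follows directly from the two timestamps. A minimal check, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-09-13 12:50:52.871403")
finished = datetime.fromisoformat("2023-09-13 12:52:51.750405")

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```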

Variables

item_id
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 584524
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 565667.07
Minimum: 211131
Maximum: 905208
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 MiB

Quantile statistics

Minimum: 211131
5-th percentile: 247522.15
Q1: 395000.75
Median: 568424.5
Q3: 739106.25
95-th percentile: 872504.85
Maximum: 905208
Range: 694077
Interquartile range (IQR): 344105.5

Descriptive statistics

Standard deviation: 200121.17
Coefficient of variation (CV): 0.35377907
Kurtosis: -1.1845506
Mean: 565667.07
Median Absolute Deviation (MAD): 171889
Skewness: -0.047515772
Sum: 3.3064598 × 10^11
Variance: 4.0048484 × 10^10
Monotonicity: Not monotonic
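
Several of the item_id statistics above are derived from others. The sketch below recomputes them from the reported values using the standard definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). Small differences are expected because the inputs are rounded.

```python
# Values copied from the item_id quantile and descriptive statistics.
minimum, maximum = 211_131, 905_208
q1, q3 = 395_000.75, 739_106.25
mean, std = 565_667.07, 200_121.17

value_range = maximum - minimum   # reported: 694077
iqr = q3 - q1                     # reported: 344105.5
cv = std / mean                   # reported: 0.35377907
variance = std ** 2               # reported: 4.0048484 × 10^10

print(value_range, iqr, round(cv, 6), f"{variance:.4e}")
```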
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
211131                   1         < 0.1%
683031                   1         < 0.1%
683022                   1         < 0.1%
683023                   1         < 0.1%
683030                   1         < 0.1%
683029                   1         < 0.1%
683033                   1         < 0.1%
683032                   1         < 0.1%
683035                   1         < 0.1%
683026                   1         < 0.1%
Other values (584514)    584514    > 99.9%
Value                    Count     Frequency (%)
211131                   1         < 0.1%
211133                   1         < 0.1%
211134                   1         < 0.1%
211135                   1         < 0.1%
211136                   1         < 0.1%
211137                   1         < 0.1%
211138                   1         < 0.1%
211139                   1         < 0.1%
211140                   1         < 0.1%
211141                   1         < 0.1%
Value                    Count     Frequency (%)
905208                   1         < 0.1%
905207                   1         < 0.1%
905206                   1         < 0.1%
905205                   1         < 0.1%
905204                   1         < 0.1%
905202                   1         < 0.1%
905200                   1         < 0.1%
905199                   1         < 0.1%
905198                   1         < 0.1%
905196                   1         < 0.1%

status
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 16
Distinct (%): < 0.1%
Missing: 15
Missing (%): < 0.1%
Memory size: 4.5 MiB

complete: 233685
canceled: 201249
received: 77290
order_refunded: 59529
refund: 8050
Other values (11): 4706

Length

Max length: 14
Median length: 8
Mean length: 8.5499334
Min length: 2

Characters and Unicode

Total characters: 4997513
Distinct characters: 24
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: complete
2nd row: canceled
3rd row: canceled
4th row: complete
5th row: order_refunded

Common Values

Value                    Count     Frequency (%)
complete                 233685    40.0%
canceled                 201249    34.4%
received                 77290     13.2%
order_refunded           59529     10.2%
refund                   8050      1.4%
cod                      2859      0.5%
paid                     1159      0.2%
closed                   494       0.1%
payment_review           57        < 0.1%
pending                  48        < 0.1%
Other values (6)         89        < 0.1%
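
The 51.7% imbalance figure reported for status appears consistent with a normalised-entropy score: one minus the Shannon entropy of the class frequencies divided by log2 of the number of classes. The sketch below is an assumption about how such a score can be computed, not a copy of the tool's code. The six rarest values are only reported as a lump of 89, so they are spread evenly here, which is also an assumption.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for perfectly balanced classes, near 1 when one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(k)


# Counts copied from the Common Values table for status.
status_counts = [233685, 201249, 77290, 59529, 8050, 2859, 1159, 494, 57, 48]
# Assumption: the remaining 6 values (89 rows in total) are spread evenly.
status_counts += [89 / 6] * 6

score = imbalance_score(status_counts)
print(f"Imbalance score: {score:.1%}")
```

Under these assumptions the score should land close to the 51.7% in the alert. The exact split of the six rare values barely matters because they account for fewer than 0.1% of rows.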

Length

Histogram of lengths of the category

Most occurring characters

ValueCountFrequency (%)
e 1289167
25.8%
c 716863
14.3%
d 469815
 
9.4%
l 435466
 
8.7%
o 296631
 
5.9%
n 269032
 
5.4%
r 264027
 
5.3%
p 235003
 
4.7%
m 233742
 
4.7%
t 233742
 
4.7%
Other values (14) 554025
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4937912
98.8%
Connector Punctuation 59593
 
1.2%
Other Punctuation 4
 
< 0.1%
Uppercase Letter 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1289167
26.1%
c 716863
14.5%
d 469815
 
9.5%
l 435466
 
8.8%
o 296631
 
6.0%
n 269032
 
5.4%
r 264027
 
5.3%
p 235003
 
4.8%
m 233742
 
4.7%
t 233742
 
4.7%
Other values (11) 494424
 
10.0%
Connector Punctuation
ValueCountFrequency (%)
_ 59593
100.0%
Other Punctuation
ValueCountFrequency (%)
\ 4
100.0%
Uppercase Letter
ValueCountFrequency (%)
N 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4937916
98.8%
Common 59597
 
1.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 1289167
26.1%
c 716863
14.5%
d 469815
 
9.5%
l 435466
 
8.8%
o 296631
 
6.0%
n 269032
 
5.4%
r 264027
 
5.3%
p 235003
 
4.8%
m 233742
 
4.7%
t 233742
 
4.7%
Other values (12) 494428
 
10.0%
Common
ValueCountFrequency (%)
_ 59593
> 99.9%
\ 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4997513
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 1289167
25.8%
c 716863
14.3%
d 469815
 
9.4%
l 435466
 
8.7%
o 296631
 
5.9%
n 269032
 
5.4%
r 264027
 
5.3%
p 235003
 
4.7%
m 233742
 
4.7%
t 233742
 
4.7%
Other values (14) 554025
11.1%

Distinct: 789
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB
Minimum: 2016-07-01 00:00:00
Maximum: 2018-08-28 00:00:00
Histogram with fixed size bins (bins=50)

sku
Text

Distinct: 84889
Distinct (%): 14.5%
Missing: 20
Missing (%): < 0.1%
Memory size: 4.5 MiB

Length

Max length: 69
Median length: 63
Mean length: 20.254032
Min length: 5

Characters and Unicode

Total characters: 11838563
Distinct characters: 94
Distinct categories: 13
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38653
Unique (%): 6.6%

Sample

1st row: kreations_YI 06-L
2nd row: kcc_Buy 2 Frey Air Freshener & Get 1 Kasual Body Spray Free
3rd row: Ego_UP0017-999-MR0
4th row: kcc_krone deal
5th row: BK7010400AG
Value                    Count     Frequency (%)
(blank)                  17805     1.9%
infinix                  5978      0.7%
halwa                    5418      0.6%
hot                      4859      0.5%
black                    4616      0.5%
one                      4385      0.5%
of                       4104      0.4%
matsam59db75adb2f80      3775      0.4%
sohan                    3718      0.4%
al                       3661      0.4%
Other values (86322)     856654    93.6%

Most occurring characters

ValueCountFrequency (%)
A 765823
 
6.5%
5 536900
 
4.5%
0 464376
 
3.9%
B 424348
 
3.6%
- 366768
 
3.1%
e 348343
 
2.9%
9 348133
 
2.9%
1 345339
 
2.9%
a 344867
 
2.9%
E 340693
 
2.9%
Other values (84) 7552973
63.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4517562
38.2%
Decimal Number 3268014
27.6%
Lowercase Letter 3026250
25.6%
Dash Punctuation 366787
 
3.1%
Space Separator 336856
 
2.8%
Connector Punctuation 297722
 
2.5%
Other Punctuation 14679
 
0.1%
Open Punctuation 4520
 
< 0.1%
Close Punctuation 4506
 
< 0.1%
Math Symbol 1525
 
< 0.1%
Other values (3) 142
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 348343
11.5%
a 344867
11.4%
i 247921
 
8.2%
o 208073
 
6.9%
l 201562
 
6.7%
r 200692
 
6.6%
n 195456
 
6.5%
t 166925
 
5.5%
s 143911
 
4.8%
c 121305
 
4.0%
Other values (18) 847195
28.0%
Uppercase Letter
ValueCountFrequency (%)
A 765823
17.0%
B 424348
 
9.4%
E 340693
 
7.5%
C 327481
 
7.2%
F 323007
 
7.2%
D 288921
 
6.4%
M 276654
 
6.1%
S 231839
 
5.1%
T 191365
 
4.2%
P 175903
 
3.9%
Other values (17) 1171528
25.9%
Decimal Number
ValueCountFrequency (%)
5 536900
16.4%
0 464376
14.2%
9 348133
10.7%
1 345339
10.6%
2 281248
8.6%
3 274352
8.4%
7 271984
8.3%
4 262229
8.0%
6 242830
7.4%
8 240623
7.4%
Other Punctuation
ValueCountFrequency (%)
& 5968
40.7%
. 3473
23.7%
/ 1987
 
13.5%
' 1174
 
8.0%
, 922
 
6.3%
# 392
 
2.7%
" 280
 
1.9%
: 251
 
1.7%
% 231
 
1.6%
\ 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
+ 1502
98.5%
| 12
 
0.8%
= 9
 
0.6%
× 2
 
0.1%
Control
ValueCountFrequency (%)
12
80.0%
2
 
13.3%
1
 
6.7%
Dash Punctuation
ValueCountFrequency (%)
- 366768
> 99.9%
19
 
< 0.1%
Space Separator
ValueCountFrequency (%)
336313
99.8%
  543
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 4495
99.4%
[ 25
 
0.6%
Close Punctuation
ValueCountFrequency (%)
) 4481
99.4%
] 25
 
0.6%
Other Symbol
ValueCountFrequency (%)
° 3
75.0%
® 1
 
25.0%
Connector Punctuation
ValueCountFrequency (%)
_ 297722
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 123
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7543812
63.7%
Common 4294751
36.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 765823
 
10.2%
B 424348
 
5.6%
e 348343
 
4.6%
a 344867
 
4.6%
E 340693
 
4.5%
C 327481
 
4.3%
F 323007
 
4.3%
D 288921
 
3.8%
M 276654
 
3.7%
i 247921
 
3.3%
Other values (45) 3855754
51.1%
Common
ValueCountFrequency (%)
5 536900
12.5%
0 464376
10.8%
- 366768
8.5%
9 348133
8.1%
1 345339
8.0%
336313
7.8%
_ 297722
 
6.9%
2 281248
 
6.5%
3 274352
 
6.4%
7 271984
 
6.3%
Other values (29) 771616
18.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11837950
> 99.9%
None 594
 
< 0.1%
Punctuation 19
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 765823
 
6.5%
5 536900
 
4.5%
0 464376
 
3.9%
B 424348
 
3.6%
- 366768
 
3.1%
e 348343
 
2.9%
9 348133
 
2.9%
1 345339
 
2.9%
a 344867
 
2.9%
E 340693
 
2.9%
Other values (76) 7552360
63.8%
None
ValueCountFrequency (%)
  543
91.4%
è 20
 
3.4%
È 13
 
2.2%
é 12
 
2.0%
° 3
 
0.5%
× 2
 
0.3%
® 1
 
0.2%
Punctuation
ValueCountFrequency (%)
19
100.0%

price
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9121
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6348.7475
Minimum: 0
Maximum: 1012625.9
Zeros: 2232
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 95
Q1: 360
Median: 899
Q3: 4070
95-th percentile: 29199
Maximum: 1012625.9
Range: 1012625.9
Interquartile range (IQR): 3710

Descriptive statistics

Standard deviation: 14949.27
Coefficient of variation (CV): 2.3546801
Kurtosis: 71.248367
Mean: 6348.7475
Median Absolute Deviation (MAD): 699
Skewness: 5.228716
Sum: 3.7109953 × 10^9
Variance: 2.2348066 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
1000                     11077     1.9%
999                      10120     1.7%
500                      9215      1.6%
699                      8407      1.4%
499                      7693      1.3%
399                      6999      1.2%
299                      6647      1.1%
799                      6024      1.0%
599                      6008      1.0%
899                      4471      0.8%
Other values (9111)      507863    86.9%
Value                    Count     Frequency (%)
0                        2232      0.4%
0.1                      15        < 0.1%
0.11                     8         < 0.1%
0.15                     1         < 0.1%
0.2                      16        < 0.1%
0.8                      9         < 0.1%
1                        1237      0.2%
1.3                      24        < 0.1%
1.5                      20        < 0.1%
1.6                      34        < 0.1%
Value                    Count     Frequency (%)
1012625.9                1         < 0.1%
515975                   1         < 0.1%
479000                   4         < 0.1%
330499                   2         < 0.1%
320000                   1         < 0.1%
307970                   2         < 0.1%
300000                   4         < 0.1%
291667                   2         < 0.1%
289999                   1         < 0.1%
265499                   1         < 0.1%

qty_ordered
Real number (ℝ)

SKEWED 

Distinct: 74
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2963882
Minimum: 1
Maximum: 1000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 1000
Range: 999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 3.9960611
Coefficient of variation (CV): 3.0824572
Kurtosis: 42587.952
Mean: 1.2963882
Median Absolute Deviation (MAD): 0
Skewness: 184.77924
Sum: 757770
Variance: 15.968504
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
1                        505214    86.4%
2                        46656     8.0%
3                        15406     2.6%
5                        11213     1.9%
4                        4141      0.7%
6                        589       0.1%
10                       292       < 0.1%
8                        153       < 0.1%
7                        112       < 0.1%
12                       95        < 0.1%
Other values (64)        653       0.1%
Value                    Count     Frequency (%)
1                        505214    86.4%
2                        46656     8.0%
3                        15406     2.6%
4                        4141      0.7%
5                        11213     1.9%
6                        589       0.1%
7                        112       < 0.1%
8                        153       < 0.1%
9                        42        < 0.1%
10                       292       < 0.1%
Value                    Count     Frequency (%)
1000                     6         < 0.1%
502                      1         < 0.1%
500                      4         < 0.1%
380                      1         < 0.1%
304                      1         < 0.1%
300                      2         < 0.1%
200                      7         < 0.1%
187                      1         < 0.1%
186                      1         < 0.1%
185                      1         < 0.1%

grand_total
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 36829
Distinct (%): 6.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8530.6186
Minimum: -1594
Maximum: 17888000
Zeros: 9632
Zeros (%): 1.6%
Negative: 76
Negative (%): < 0.1%
Memory size: 4.5 MiB

Quantile statistics

Minimum: -1594
5-th percentile: 160
Q1: 945
Median: 1960.4
Q3: 6999
95-th percentile: 34497.65
Maximum: 17888000
Range: 17889594
Interquartile range (IQR): 6054

Descriptive statistics

Standard deviation: 61320.815
Coefficient of variation (CV): 7.1883198
Kurtosis: 74191.656
Mean: 8530.6186
Median Absolute Deviation (MAD): 1461.4
Skewness: 254.99821
Sum: 4.9863513 × 10^9
Variance: 3.7602423 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
0                        9632      1.6%
1000                     5146      0.9%
2000                     3781      0.6%
2500                     3235      0.6%
5000                     2912      0.5%
1149                     2525      0.4%
12599                    2520      0.4%
399                      2459      0.4%
4000                     2387      0.4%
150                      2256      0.4%
Other values (36819)     547671    93.7%
Value                    Count     Frequency (%)
-1594                    1         < 0.1%
-1311.5                  7         < 0.1%
-1106.65                 1         < 0.1%
-873.4                   1         < 0.1%
-528                     1         < 0.1%
-511                     2         < 0.1%
-425.7                   2         < 0.1%
-384                     1         < 0.1%
-340.6                   16        < 0.1%
-249                     1         < 0.1%
Value                    Count     Frequency (%)
17888000                 6         < 0.1%
1315875                  1         < 0.1%
1280473                  2         < 0.1%
1279980                  1         < 0.1%
1155966                  1         < 0.1%
1039479                  1         < 0.1%
1028751                  4         < 0.1%
1012625.9                1         < 0.1%
888065                   7         < 0.1%
847024                   7         < 0.1%

increment_id
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

category_name_1
Categorical

Distinct: 16
Distinct (%): < 0.1%
Missing: 164
Missing (%): < 0.1%
Memory size: 4.5 MiB

Mobiles & Tablets: 115710
Men's Fashion: 92221
Women's Fashion: 59721
Appliances: 52413
Superstore: 43613
Other values (11): 220682

Length

Max length: 18
Median length: 17
Mean length: 12.839072
Min length: 2

Characters and Unicode

Total characters: 7502640
Distinct characters: 39
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Women's Fashion
2nd row: Beauty & Grooming
3rd row: Women's Fashion
4th row: Beauty & Grooming
5th row: Soghaat

Common Values

Value                    Count     Frequency (%)
Mobiles & Tablets        115710    19.8%
Men's Fashion            92221     15.8%
Women's Fashion          59721     10.2%
Appliances               52413     9.0%
Superstore               43613     7.5%
Beauty & Grooming        41496     7.1%
Soghaat                  34011     5.8%
Others                   29218     5.0%
Home & Living            26504     4.5%
Entertainment            26326     4.5%
Other values (6)         63127     10.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
221184
18.8%
fashion 151942
12.9%
mobiles 115710
9.8%
tablets 115710
9.8%
men's 92221
 
7.8%
women's 59721
 
5.1%
appliances 52413
 
4.4%
superstore 43613
 
3.7%
beauty 41496
 
3.5%
grooming 41496
 
3.5%
Other values (14) 243164
20.6%

Most occurring characters

ValueCountFrequency (%)
s 696414
 
9.3%
e 690373
 
9.2%
594310
 
7.9%
o 562102
 
7.5%
n 522686
 
7.0%
a 493383
 
6.6%
i 476800
 
6.4%
t 397441
 
5.3%
l 304813
 
4.1%
b 247914
 
3.3%
Other values (29) 2516404
33.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5569868
74.2%
Uppercase Letter 957486
 
12.8%
Space Separator 594310
 
7.9%
Other Punctuation 380976
 
5.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 696414
12.5%
e 690373
12.4%
o 562102
10.1%
n 522686
9.4%
a 493383
8.9%
i 476800
8.6%
t 397441
7.1%
l 304813
 
5.5%
b 247914
 
4.5%
h 236151
 
4.2%
Other values (10) 941791
16.9%
Uppercase Letter
ValueCountFrequency (%)
M 207931
21.7%
F 151942
15.9%
T 115710
12.1%
S 98604
10.3%
B 59860
 
6.3%
W 59721
 
6.2%
A 52413
 
5.5%
H 44006
 
4.6%
G 41496
 
4.3%
E 29804
 
3.1%
Other values (5) 95999
10.0%
Other Punctuation
ValueCountFrequency (%)
& 221184
58.1%
' 151942
39.9%
\ 7850
 
2.1%
Space Separator
ValueCountFrequency (%)
594310
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6527354
87.0%
Common 975286
 
13.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 696414
 
10.7%
e 690373
 
10.6%
o 562102
 
8.6%
n 522686
 
8.0%
a 493383
 
7.6%
i 476800
 
7.3%
t 397441
 
6.1%
l 304813
 
4.7%
b 247914
 
3.8%
h 236151
 
3.6%
Other values (25) 1899277
29.1%
Common
ValueCountFrequency (%)
594310
60.9%
& 221184
 
22.7%
' 151942
 
15.6%
\ 7850
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7502640
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 696414
 
9.3%
e 690373
 
9.2%
594310
 
7.9%
o 562102
 
7.5%
n 522686
 
7.0%
a 493383
 
6.6%
i 476800
 
6.4%
t 397441
 
5.3%
l 304813
 
4.1%
b 247914
 
3.3%
Other values (29) 2516404
33.5%

sales_commission_code
Text

MISSING 

Distinct: 7225
Distinct (%): 1.6%
Missing: 137175
Missing (%): 23.5%
Memory size: 4.5 MiB

Length

Max length: 100
Median length: 2
Mean length: 4.0094088
Min length: 1

Characters and Unicode

Total characters: 1793605
Distinct characters: 88
Distinct categories: 14
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3126
Unique (%): 0.7%

Sample

1st row: \N
2nd row: \N
3rd row: \N
4th row: R-FSD-52352
5th row: \N
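
Four of the five sample rows hold the literal string `\N`, which looks like a database-export null marker that was read as ordinary text. If that reading is right, the 23.5% missing rate understates how many rows lack a real commission code. The sketch below shows one way such placeholders could be normalised before profiling; treating `\N` and the empty string as null is an assumption, and `clean_code` is a hypothetical helper.

```python
# Assumption: these tokens mean "no value" in this column.
NULL_TOKENS = {r"\N", ""}


def clean_code(raw):
    """Return None for null placeholders, otherwise the stripped code."""
    if raw is None:
        return None
    value = raw.strip()
    return None if value in NULL_TOKENS else value


# The five sample rows shown above.
sample = [r"\N", r"\N", r"\N", "R-FSD-52352", r"\N"]
cleaned = [clean_code(v) for v in sample]
missing_share = cleaned.count(None) / len(cleaned)

print(cleaned)
print(f"Share without a real code in the sample: {missing_share:.0%}")
```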
Value                    Count     Frequency (%)
n                        339002    73.3%
c                        4272      0.9%
skz                      2807      0.6%
40968                    2711      0.6%
c-lhc-30667              2654      0.6%
c-lhw-50074              2436      0.5%
r                        2366      0.5%
cisb30211                2334      0.5%
c-rwp-31924              1938      0.4%
cmux33202                1883      0.4%
Other values (6333)      100087    21.6%

Most occurring characters

ValueCountFrequency (%)
N 339256
18.9%
\ 339001
18.9%
- 147598
 
8.2%
1 93561
 
5.2%
0 92891
 
5.2%
4 79416
 
4.4%
3 57271
 
3.2%
C 54301
 
3.0%
2 47864
 
2.7%
6 45449
 
2.5%
Other values (78) 496997
27.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 663978
37.0%
Decimal Number 564880
31.5%
Other Punctuation 339781
18.9%
Dash Punctuation 147598
 
8.2%
Lowercase Letter 58671
 
3.3%
Space Separator 15594
 
0.9%
Connector Punctuation 3070
 
0.2%
Math Symbol 22
 
< 0.1%
Format 4
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Other values (4) 5
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 339256
51.1%
C 54301
 
8.2%
R 41488
 
6.2%
H 29441
 
4.4%
U 21270
 
3.2%
S 20710
 
3.1%
W 19366
 
2.9%
D 17800
 
2.7%
M 15053
 
2.3%
X 14595
 
2.2%
Other values (17) 90698
 
13.7%
Lowercase Letter
ValueCountFrequency (%)
c 10452
17.8%
s 6115
10.4%
u 5500
 
9.4%
i 3486
 
5.9%
d 3392
 
5.8%
b 3268
 
5.6%
h 3158
 
5.4%
f 2890
 
4.9%
x 2636
 
4.5%
m 2610
 
4.4%
Other values (17) 15164
25.8%
Other Punctuation
ValueCountFrequency (%)
\ 339001
99.8%
/ 434
 
0.1%
. 296
 
0.1%
# 10
 
< 0.1%
, 8
 
< 0.1%
@ 7
 
< 0.1%
: 4
 
< 0.1%
? 4
 
< 0.1%
! 4
 
< 0.1%
" 4
 
< 0.1%
Other values (3) 9
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 93561
16.6%
0 92891
16.4%
4 79416
14.1%
3 57271
10.1%
2 47864
8.5%
6 45449
8.0%
5 44564
7.9%
9 37008
 
6.6%
8 34037
 
6.0%
7 32819
 
5.8%
Math Symbol
ValueCountFrequency (%)
+ 20
90.9%
= 2
 
9.1%
Dash Punctuation
ValueCountFrequency (%)
- 147598
100.0%
Space Separator
ValueCountFrequency (%)
15594
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3070
100.0%
Format
ValueCountFrequency (%)
­ 4
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 2
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%
Control
ValueCountFrequency (%)
 1
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%
Nonspacing Mark
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1070955
59.7%
Latin 722649
40.3%
Inherited 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 339256
46.9%
C 54301
 
7.5%
R 41488
 
5.7%
H 29441
 
4.1%
U 21270
 
2.9%
S 20710
 
2.9%
W 19366
 
2.7%
D 17800
 
2.5%
M 15053
 
2.1%
X 14595
 
2.0%
Other values (44) 149369
20.7%
Common
ValueCountFrequency (%)
\ 339001
31.7%
- 147598
13.8%
1 93561
 
8.7%
0 92891
 
8.7%
4 79416
 
7.4%
3 57271
 
5.3%
2 47864
 
4.5%
6 45449
 
4.2%
5 44564
 
4.2%
9 37008
 
3.5%
Other values (23) 86332
 
8.1%
Inherited
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1793596
> 99.9%
None 7
 
< 0.1%
Dingbats 1
 
< 0.1%
VS 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 339256
18.9%
\ 339001
18.9%
- 147598
 
8.2%
1 93561
 
5.2%
0 92891
 
5.2%
4 79416
 
4.4%
3 57271
 
3.2%
C 54301
 
3.0%
2 47864
 
2.7%
6 45449
 
2.5%
Other values (73) 496988
27.7%
None
ValueCountFrequency (%)
­ 4
57.1%
Ç 2
28.6%
ß 1
 
14.3%
Dingbats
ValueCountFrequency (%)
1
100.0%
VS
ValueCountFrequency (%)
1
100.0%

discount_amount
Real number (ℝ)

ZEROS 

Distinct: 28058
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 499.49278
Minimum: -599.5
Maximum: 90300
Zeros: 376306
Zeros (%): 64.4%
Negative: 3
Negative (%): < 0.1%
Memory size: 4.5 MiB

Quantile statistics

Minimum: -599.5
5-th percentile: 0
Q1: 0
Median: 0
Q3: 160.5
95-th percentile: 2930
Maximum: 90300
Range: 90899.5
Interquartile range (IQR): 160.5

Descriptive statistics

Standard deviation: 1506.943
Coefficient of variation (CV): 3.0169466
Kurtosis: 117.5672
Mean: 499.49278
Median Absolute Deviation (MAD): 0
Skewness: 6.8424543
Sum: 2.9196551 × 10^8
Variance: 2270877.3
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                    Count     Frequency (%)
0                        376306    64.4%
1000                     5426      0.9%
2000                     4143      0.7%
500                      3648      0.6%
200                      3496      0.6%
3000                     2361      0.4%
7000                     2289      0.4%
1500                     1625      0.3%
4000                     1429      0.2%
2200                     1409      0.2%
Other values (28048)     182392    31.2%
Value                    Count     Frequency (%)
-599.5                   1         < 0.1%
-2                       2         < 0.1%
0                        376306    64.4%
0.08                     1         < 0.1%
0.1                      6         < 0.1%
0.14                     1         < 0.1%
0.15                     36        < 0.1%
0.16                     1         < 0.1%
0.19                     1         < 0.1%
0.2                      12        < 0.1%
Value                    Count     Frequency (%)
90300                    2         < 0.1%
50355.25                 5         < 0.1%
50127.75                 5         < 0.1%
48205                    1         < 0.1%
47500                    4         < 0.1%
45000                    1         < 0.1%
42498.75                 1         < 0.1%
41885                    1         < 0.1%
38075.07                 8         < 0.1%
35609.39                 1         < 0.1%

payment_method
Categorical

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

cod: 271960
Payaxis: 97641
Easypay: 82900
jazzwallet: 35145
easypay_voucher: 31176
Other values (13): 65702

Length

Max length: 17
Median length: 16
Mean length: 6.1900538
Min length: 3

Characters and Unicode

Total characters: 3618235
Distinct characters: 30
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: cod
2nd row: cod
3rd row: cod
4th row: cod
5th row: cod

Common Values

Value                    Count     Frequency (%)
cod                      271960    46.5%
Payaxis                  97641     16.7%
Easypay                  82900     14.2%
jazzwallet               35145     6.0%
easypay_voucher          31176     5.3%
bankalfalah              23065     3.9%
jazzvoucher              15633     2.7%
Easypay_MA               14028     2.4%
customercredit           7555      1.3%
apg                      1758      0.3%
Other values (8)         3663      0.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
cod 271960
46.5%
payaxis 97641
 
16.7%
easypay 82900
 
14.2%
jazzwallet 35145
 
6.0%
easypay_voucher 31176
 
5.3%
bankalfalah 23065
 
3.9%
jazzvoucher 15633
 
2.7%
easypay_ma 14028
 
2.4%
customercredit 7555
 
1.3%
apg 1758
 
0.3%
Other values (8) 3663
 
0.6%

Most occurring characters

ValueCountFrequency (%)
a 635647
17.6%
y 355187
 
9.8%
c 337363
 
9.3%
o 327913
 
9.1%
d 282261
 
7.8%
s 234824
 
6.5%
e 132555
 
3.7%
p 130764
 
3.6%
l 118040
 
3.3%
i 107930
 
3.0%
Other values (20) 955751
26.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3350406
92.6%
Uppercase Letter 222625
 
6.2%
Connector Punctuation 45204
 
1.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 635647
19.0%
y 355187
10.6%
c 337363
10.1%
o 327913
9.8%
d 282261
8.4%
s 234824
 
7.0%
e 132555
 
4.0%
p 130764
 
3.9%
l 118040
 
3.5%
i 107930
 
3.2%
Other values (15) 687922
20.5%
Uppercase Letter
ValueCountFrequency (%)
P 97641
43.9%
E 96928
43.5%
M 14028
 
6.3%
A 14028
 
6.3%
Connector Punctuation
ValueCountFrequency (%)
_ 45204
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3573031
98.8%
Common 45204
 
1.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 635647
17.8%
y 355187
 
9.9%
c 337363
 
9.4%
o 327913
 
9.2%
d 282261
 
7.9%
s 234824
 
6.6%
e 132555
 
3.7%
p 130764
 
3.7%
l 118040
 
3.3%
i 107930
 
3.0%
Other values (19) 910547
25.5%
Common
ValueCountFrequency (%)
_ 45204
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3618235
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 635647
17.6%
y 355187
 
9.8%
c 337363
 
9.3%
o 327913
 
9.1%
d 282261
 
7.8%
s 234824
 
6.5%
e 132555
 
3.7%
p 130764
 
3.6%
l 118040
 
3.3%
i 107930
 
3.0%
Other values (20) 955751
26.4%

Distinct: 789
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB
Minimum: 2016-07-01 00:00:00
Maximum: 2018-08-28 00:00:00
Histogram with fixed size bins (bins=50)

BI Status
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Net: 234178
Gross: 201454
Valid: 148891
#REF!: 1

Length

Max length: 5
Median length: 5
Mean length: 4.1987395
Min length: 3

Characters and Unicode

Total characters: 2454264
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: #REF!
2nd row: Gross
3rd row: Gross
4th row: Net
5th row: Valid

Common Values

Value                    Count     Frequency (%)
Net                      234178    40.1%
Gross                    201454    34.5%
Valid                    148891    25.5%
#REF!                    1         < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
net 234178
40.1%
gross 201454
34.5%
valid 148891
25.5%
ref 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
s 402908
16.4%
N 234178
9.5%
t 234178
9.5%
e 234178
9.5%
G 201454
8.2%
r 201454
8.2%
o 201454
8.2%
i 148891
 
6.1%
d 148891
 
6.1%
a 148891
 
6.1%
Other values (7) 297787
12.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1869736
76.2%
Uppercase Letter 584526
 
23.8%
Other Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 402908
21.5%
t 234178
12.5%
e 234178
12.5%
r 201454
10.8%
o 201454
10.8%
i 148891
 
8.0%
d 148891
 
8.0%
a 148891
 
8.0%
l 148891
 
8.0%
Uppercase Letter
ValueCountFrequency (%)
N 234178
40.1%
G 201454
34.5%
V 148891
25.5%
R 1
 
< 0.1%
E 1
 
< 0.1%
F 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
# 1
50.0%
! 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2454262
> 99.9%
Common 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 402908
16.4%
N 234178
9.5%
t 234178
9.5%
e 234178
9.5%
G 201454
8.2%
r 201454
8.2%
o 201454
8.2%
i 148891
 
6.1%
d 148891
 
6.1%
a 148891
 
6.1%
Other values (5) 297785
12.1%
Common
ValueCountFrequency (%)
# 1
50.0%
! 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2454264
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 402908
16.4%
N 234178
9.5%
t 234178
9.5%
e 234178
9.5%
G 201454
8.2%
r 201454
8.2%
o 201454
8.2%
i 148891
 
6.1%
d 148891
 
6.1%
a 148891
 
6.1%
Other values (7) 297787
12.1%

MV
Text

Distinct: 9720
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 9
Median length: 7
Mean length: 4.1444731
Min length: 1

Characters and Unicode

Total characters: 2422544
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2233
Unique (%): 0.4%

Sample

1st row: 1,950
2nd row: 240
3rd row: 2,450
4th row: 360
5th row: 1,110
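
MV is typed as Text because its values carry thousands separators such as `1,950`. The character statistics for this column also list 2232 dashes, the same as the number of zeros in price, which suggests that zero was exported as a padded `-` in a spreadsheet accounting style. That interpretation is an assumption. The sketch below uses a hypothetical `parse_mv` helper to convert such strings to numbers; the padded dash in the sample is an invented example of that form.

```python
def parse_mv(raw):
    """Convert a formatted MV string such as '1,950' to a float."""
    text = raw.strip()
    if text == "-":
        # Assumption: a lone dash is an accounting-style zero.
        return 0.0
    return float(text.replace(",", ""))


# First five values come from the Sample above; the last one is hypothetical.
sample = ["1,950", "240", "2,450", "360", "1,110", " -   "]
print([parse_mv(v) for v in sample])
```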
Value                    Count     Frequency (%)
999                      9516      1.6%
699                      7801      1.3%
499                      7157      1.2%
1,000                    6895      1.2%
399                      6506      1.1%
299                      5997      1.0%
599                      5721      1.0%
799                      5715      1.0%
2,000                    4499      0.8%
899                      4353      0.7%
Other values (9710)      520364    89.0%

Most occurring characters

ValueCountFrequency (%)
0 414466
17.1%
9 399121
16.5%
, 289367
11.9%
1 247310
10.2%
5 220128
9.1%
2 188243
7.8%
4 151901
 
6.3%
3 146616
 
6.1%
8 128646
 
5.3%
6 117186
 
4.8%
Other values (3) 119560
 
4.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2122017
87.6%
Other Punctuation 289367
 
11.9%
Space Separator 8928
 
0.4%
Dash Punctuation 2232
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 414466
19.5%
9 399121
18.8%
1 247310
11.7%
5 220128
10.4%
2 188243
8.9%
4 151901
 
7.2%
3 146616
 
6.9%
8 128646
 
6.1%
6 117186
 
5.5%
7 108400
 
5.1%
Other Punctuation
ValueCountFrequency (%)
, 289367
100.0%
Space Separator
ValueCountFrequency (%)
8928
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2232
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2422544
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 414466
17.1%
9 399121
16.5%
, 289367
11.9%
1 247310
10.2%
5 220128
9.1%
2 188243
7.8%
4 151901
 
6.3%
3 146616
 
6.1%
8 128646
 
5.3%
6 117186
 
4.8%
Other values (3) 119560
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2422544
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 414466
17.1%
9 399121
16.5%
, 289367
11.9%
1 247310
10.2%
5 220128
9.1%
2 188243
7.8%
4 151901
 
6.3%
3 146616
 
6.1%
8 128646
 
5.3%
6 117186
 
4.8%
Other values (3) 119560
 
4.9%

Year
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

2017: 290920
2018: 159695
2016: 133909

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2338096
Distinct characters: 6
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2016
2nd row: 2016
3rd row: 2016
4th row: 2016
5th row: 2016

Common Values

Value | Count | Frequency (%)
2017 | 290920 | 49.8%
2018 | 159695 | 27.3%
2016 | 133909 | 22.9%
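
Each Frequency (%) figure is the count divided by the total number of rows. As a minimal sketch (the variable names are illustrative and not part of the report), the Year shares can be recomputed from the counts above:

```python
# Counts per Year, copied from the Common Values table above.
year_counts = {"2017": 290920, "2018": 159695, "2016": 133909}

# Year has no missing values, so the counts add up to the row count.
total = sum(year_counts.values())

# Share of each value, rounded to one decimal place as in the report.
year_share = {year: f"{count / total:.1%}" for year, count in year_counts.items()}

for year, share in year_share.items():
    print(year, share)
```

The same division applies to every Value / Count / Frequency (%) table in this report.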

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2017 | 290920 | 49.8%
2018 | 159695 | 27.3%
2016 | 133909 | 22.9%

Most occurring characters

Value | Count | Frequency (%)
2 | 584524 | 25.0%
0 | 584524 | 25.0%
1 | 584524 | 25.0%
7 | 290920 | 12.4%
8 | 159695 | 6.8%
6 | 133909 | 5.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2338096 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 584524 | 25.0%
0 | 584524 | 25.0%
1 | 584524 | 25.0%
7 | 290920 | 12.4%
8 | 159695 | 6.8%
6 | 133909 | 5.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2338096 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 584524 | 25.0%
0 | 584524 | 25.0%
1 | 584524 | 25.0%
7 | 290920 | 12.4%
8 | 159695 | 6.8%
6 | 133909 | 5.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2338096 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 584524 | 25.0%
0 | 584524 | 25.0%
1 | 584524 | 25.0%
7 | 290920 | 12.4%
8 | 159695 | 6.8%
6 | 133909 | 5.7%

Month
Real number (ℝ)

HIGH CORRELATION 

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.1676544
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 4
median: 7
Q3: 11
95-th percentile: 11
Maximum: 12
Range: 11
Interquartile range (IQR): 7
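
Because Month takes only 12 distinct values, the quantiles can be read off the cumulative counts. The sketch below uses the per-month counts listed in this section and an inverse-CDF rule (smallest value whose cumulative count reaches q × n). That rule is an assumption made for illustration: the profiling tool may interpolate, which could give different results on other data. The helper name `quantile_from_counts` is hypothetical.

```python
# Month -> count, copied from the frequency tables in this section.
month_counts = {
    1: 26067, 2: 38777, 3: 61489, 4: 34091, 5: 62603, 6: 34530,
    7: 39151, 8: 48514, 9: 24024, 10: 30623, 11: 155456, 12: 29199,
}

def quantile_from_counts(counts, q):
    """Return the smallest value whose cumulative count reaches q * n."""
    n = sum(counts.values())
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if running >= q * n:
            return value
    return max(counts)

q1 = quantile_from_counts(month_counts, 0.25)
median = quantile_from_counts(month_counts, 0.50)
q3 = quantile_from_counts(month_counts, 0.75)
iqr = q3 - q1  # interquartile range

print(q1, median, q3, iqr)
```

November alone holds 26.6% of the rows, which is why both Q3 and the 95-th percentile land on 11.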

Descriptive statistics

Standard deviation: 3.4863048
Coefficient of variation (CV): 0.4863941
Kurtosis: -1.3726432
Mean: 7.1676544
Median Absolute Deviation (MAD): 4
Skewness: -0.1929145
Sum: 4189666
Variance: 12.154321
Monotonicity: Not monotonic
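
The sum, mean, variance, standard deviation, and coefficient of variation above can be cross-checked from the per-month counts in this section. This is a minimal sketch that assumes the sample estimator (n − 1 in the denominator). The report does not state which estimator it uses, and with this many rows the two variants differ only around the sixth decimal place.

```python
import math

# Month -> count, copied from the frequency tables in this section.
month_counts = {
    1: 26067, 2: 38777, 3: 61489, 4: 34091, 5: 62603, 6: 34530,
    7: 39151, 8: 48514, 9: 24024, 10: 30623, 11: 155456, 12: 29199,
}

n = sum(month_counts.values())                         # number of rows
total = sum(v * c for v, c in month_counts.items())    # "Sum"
mean = total / n                                       # "Mean"

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in month_counts.items()) / (n - 1)
std = math.sqrt(variance)                              # "Standard deviation"
cv = std / mean                                        # "Coefficient of variation (CV)"

print(total, round(mean, 7), round(variance, 6), round(std, 7), round(cv, 7))
```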
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
11 | 155456 | 26.6%
5 | 62603 | 10.7%
3 | 61489 | 10.5%
8 | 48514 | 8.3%
7 | 39151 | 6.7%
2 | 38777 | 6.6%
6 | 34530 | 5.9%
4 | 34091 | 5.8%
10 | 30623 | 5.2%
12 | 29199 | 5.0%
Other values (2) | 50091 | 8.6%

Value | Count | Frequency (%)
1 | 26067 | 4.5%
2 | 38777 | 6.6%
3 | 61489 | 10.5%
4 | 34091 | 5.8%
5 | 62603 | 10.7%
6 | 34530 | 5.9%
7 | 39151 | 6.7%
8 | 48514 | 8.3%
9 | 24024 | 4.1%
10 | 30623 | 5.2%

Value | Count | Frequency (%)
12 | 29199 | 5.0%
11 | 155456 | 26.6%
10 | 30623 | 5.2%
9 | 24024 | 4.1%
8 | 48514 | 8.3%
7 | 39151 | 6.7%
6 | 34530 | 5.9%
5 | 62603 | 10.7%
4 | 34091 | 5.8%
3 | 61489 | 10.5%

Customer Since

Distinct: 26
Distinct (%): < 0.1%
Missing: 11
Missing (%): < 0.1%
Memory size: 4.5 MiB
Minimum: 2016-07-01 00:00:00
Maximum: 2018-08-01 00:00:00
Histogram with fixed size bins (bins=26)

M-Y
Categorical

HIGH CORRELATION 

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Nov-17 | 83928
Nov-16 | 71528
Mar-18 | 41955
May-17 | 34736
May-18 | 27867
Other values (21) | 324510

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Characters and Unicode

Total characters: 3507144
Distinct characters: 27
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
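
As an illustration of those character properties, Python's standard `unicodedata` module reports the general category of each code point. The sketch below classifies one M-Y value. The helper name `category_counts` is hypothetical, and the two-letter codes correspond to the category names used in this report (`Lu` Uppercase Letter, `Ll` Lowercase Letter, `Nd` Decimal Number, `Pd` Dash Punctuation).

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)

# "Nov-17" has one uppercase letter, two lowercase letters,
# one dash, and two decimal digits.
counts = category_counts("Nov-17")
print(dict(counts))
```

Every M-Y value follows the same pattern, which is consistent with the four distinct categories reported for this variable.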

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Jul-16
2nd row: Jul-16
3rd row: Jul-16
4th row: Jul-16
5th row: Jul-16

Common Values

Value | Count | Frequency (%)
Nov-17 | 83928 | 14.4%
Nov-16 | 71528 | 12.2%
Mar-18 | 41955 | 7.2%
May-17 | 34736 | 5.9%
May-18 | 27867 | 4.8%
Feb-18 | 26916 | 4.6%
Aug-17 | 25083 | 4.3%
Apr-17 | 21678 | 3.7%
Jun-17 | 19793 | 3.4%
Mar-17 | 19534 | 3.3%
Other values (16) | 211506 | 36.2%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
nov-17 | 83928 | 14.4%
nov-16 | 71528 | 12.2%
mar-18 | 41955 | 7.2%
may-17 | 34736 | 5.9%
may-18 | 27867 | 4.8%
feb-18 | 26916 | 4.6%
aug-17 | 25083 | 4.3%
apr-17 | 21678 | 3.7%
jun-17 | 19793 | 3.4%
mar-17 | 19534 | 3.3%
Other values (16) | 211506 | 36.2%

Most occurring characters

Value | Count | Frequency (%)
- | 584524 | 16.7%
1 | 584524 | 16.7%
7 | 290920 | 8.3%
8 | 159695 | 4.6%
N | 155456 | 4.4%
o | 155456 | 4.4%
v | 155456 | 4.4%
a | 150159 | 4.3%
6 | 133909 | 3.8%
M | 124092 | 3.5%
Other values (17) | 1012953 | 28.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1169048 | 33.3%
Lowercase Letter | 1169048 | 33.3%
Dash Punctuation | 584524 | 16.7%
Uppercase Letter | 584524 | 16.7%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
o | 155456 | 13.3%
v | 155456 | 13.3%
a | 150159 | 12.8%
u | 122195 | 10.5%
r | 95580 | 8.2%
e | 92000 | 7.9%
y | 62603 | 5.4%
n | 60597 | 5.2%
c | 59822 | 5.1%
p | 58115 | 5.0%
Other values (4) | 157065 | 13.4%

Uppercase Letter
Value | Count | Frequency (%)
N | 155456 | 26.6%
M | 124092 | 21.2%
J | 99748 | 17.1%
A | 82605 | 14.1%
F | 38777 | 6.6%
O | 30623 | 5.2%
D | 29199 | 5.0%
S | 24024 | 4.1%

Decimal Number
Value | Count | Frequency (%)
1 | 584524 | 50.0%
7 | 290920 | 24.9%
8 | 159695 | 13.7%
6 | 133909 | 11.5%

Dash Punctuation
Value | Count | Frequency (%)
- | 584524 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1753572 | 50.0%
Latin | 1753572 | 50.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
N | 155456 | 8.9%
o | 155456 | 8.9%
v | 155456 | 8.9%
a | 150159 | 8.6%
M | 124092 | 7.1%
u | 122195 | 7.0%
J | 99748 | 5.7%
r | 95580 | 5.5%
e | 92000 | 5.2%
A | 82605 | 4.7%
Other values (12) | 520825 | 29.7%

Common
Value | Count | Frequency (%)
- | 584524 | 33.3%
1 | 584524 | 33.3%
7 | 290920 | 16.6%
8 | 159695 | 9.1%
6 | 133909 | 7.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3507144 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 584524 | 16.7%
1 | 584524 | 16.7%
7 | 290920 | 8.3%
8 | 159695 | 4.6%
N | 155456 | 4.4%
o | 155456 | 4.4%
v | 155456 | 4.4%
a | 150159 | 4.3%
6 | 133909 | 3.8%
M | 124092 | 3.5%
Other values (17) | 1012953 | 28.9%

FY
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

FY18 | 306883
FY17 | 254706
FY19 | 22935

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2338096
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: FY17
2nd row: FY17
3rd row: FY17
4th row: FY17
5th row: FY17

Common Values

Value | Count | Frequency (%)
FY18 | 306883 | 52.5%
FY17 | 254706 | 43.6%
FY19 | 22935 | 3.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
fy18 | 306883 | 52.5%
fy17 | 254706 | 43.6%
fy19 | 22935 | 3.9%

Most occurring characters

Value | Count | Frequency (%)
F | 584524 | 25.0%
Y | 584524 | 25.0%
1 | 584524 | 25.0%
8 | 306883 | 13.1%
7 | 254706 | 10.9%
9 | 22935 | 1.0%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 1169048 | 50.0%
Decimal Number | 1169048 | 50.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 584524 | 50.0%
8 | 306883 | 26.3%
7 | 254706 | 21.8%
9 | 22935 | 2.0%

Uppercase Letter
Value | Count | Frequency (%)
F | 584524 | 50.0%
Y | 584524 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 1169048 | 50.0%
Common | 1169048 | 50.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 584524 | 50.0%
8 | 306883 | 26.3%
7 | 254706 | 21.8%
9 | 22935 | 2.0%

Latin
Value | Count | Frequency (%)
F | 584524 | 50.0%
Y | 584524 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2338096 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
F | 584524 | 25.0%
Y | 584524 | 25.0%
1 | 584524 | 25.0%
8 | 306883 | 13.1%
7 | 254706 | 10.9%
9 | 22935 | 1.0%

Customer ID
Real number (ℝ)

HIGH CORRELATION 

Distinct: 115326
Distinct (%): 19.7%
Missing: 11
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 45790.512
Minimum: 1
Maximum: 115326
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 MiB

Quantile statistics

Minimum: 1
5-th percentile: 548
Q1: 13516
median: 42856
Q3: 73536
95-th percentile: 107029
Maximum: 115326
Range: 115325
Interquartile range (IQR): 60020

Descriptive statistics

Standard deviation: 34414.962
Coefficient of variation (CV): 0.75157409
Kurtosis: -1.1258359
Mean: 45790.512
Median Absolute Deviation (MAD): 29897
Skewness: 0.32626641
Sum: 2.676515 × 10^10
Variance: 1.1843896 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
85775 | 2524 | 0.4%
163 | 2349 | 0.4%
35 | 1877 | 0.3%
33 | 1397 | 0.2%
31025 | 1369 | 0.2%
806 | 1310 | 0.2%
1404 | 1269 | 0.2%
767 | 1234 | 0.2%
820 | 1190 | 0.2%
58 | 1182 | 0.2%
Other values (115316) | 568812 | 97.3%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 2 | < 0.1%
3 | 5 | < 0.1%
4 | 428 | 0.1%
5 | 1 | < 0.1%
6 | 2 | < 0.1%
7 | 4 | < 0.1%
8 | 2 | < 0.1%
9 | 1 | < 0.1%
10 | 2 | < 0.1%

Value | Count | Frequency (%)
115326 | 1 | < 0.1%
115325 | 2 | < 0.1%
115324 | 1 | < 0.1%
115323 | 1 | < 0.1%
115322 | 2 | < 0.1%
115321 | 1 | < 0.1%
115320 | 3 | < 0.1%
115319 | 3 | < 0.1%
115318 | 1 | < 0.1%
115317 | 1 | < 0.1%

Interactions

[49 interaction plots; images not included]

Correlations

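(variable) | item_id | price | qty_ordered | grand_total | discount_amount | Month | Customer ID | status | category_name_1 | payment_method | BI Status | Year | M-Y | FY
item_id | 1.000 | 0.179 | 0.169 | 0.293 | 0.143 | -0.275 | 0.767 | 0.153 | 0.194 | 0.289 | 0.213 | 0.927 | 0.891 | 0.778
price | 0.179 | 1.000 | -0.163 | 0.730 | 0.282 | -0.057 | 0.221 | 0.028 | 0.045 | 0.070 | 0.039 | 0.052 | 0.054 | 0.067
qty_ordered | 0.169 | -0.163 | 1.000 | 0.058 | -0.049 | -0.158 | 0.126 | 0.001 | 0.005 | 0.006 | 0.002 | 0.003 | 0.011 | 0.002
grand_total | 0.293 | 0.730 | 0.058 | 1.000 | 0.247 | -0.113 | 0.277 | 0.000 | 0.004 | 0.012 | 0.001 | 0.003 | 0.016 | 0.003
discount_amount | 0.143 | 0.282 | -0.049 | 0.247 | 1.000 | 0.057 | 0.099 | 0.016 | 0.047 | 0.066 | 0.020 | 0.009 | 0.038 | 0.021
Month | -0.275 | -0.057 | -0.158 | -0.113 | 0.057 | 1.000 | -0.206 | 0.111 | 0.168 | 0.176 | 0.155 | 0.486 | 1.000 | 0.369
Customer ID | 0.767 | 0.221 | 0.126 | 0.277 | 0.099 | -0.206 | 1.000 | 0.105 | 0.209 | 0.222 | 0.154 | 0.673 | 0.559 | 0.574
status | 0.153 | 0.028 | 0.001 | 0.000 | 0.016 | 0.111 | 0.105 | 1.000 | 0.083 | 0.154 | 0.816 | 0.225 | 0.131 | 0.216
category_name_1 | 0.194 | 0.045 | 0.005 | 0.004 | 0.047 | 0.168 | 0.209 | 0.083 | 1.000 | 0.135 | 0.160 | 0.272 | 0.184 | 0.199
payment_method | 0.289 | 0.070 | 0.006 | 0.012 | 0.066 | 0.176 | 0.222 | 0.154 | 0.135 | 1.000 | 0.330 | 0.359 | 0.243 | 0.410
BI Status | 0.213 | 0.039 | 0.002 | 0.001 | 0.020 | 0.155 | 0.154 | 0.816 | 0.160 | 0.330 | 1.000 | 0.180 | 0.224 | 0.167
Year | 0.927 | 0.052 | 0.003 | 0.003 | 0.009 | 0.486 | 0.673 | 0.225 | 0.272 | 0.359 | 0.180 | 1.000 | 1.000 | 0.535
M-Y | 0.891 | 0.054 | 0.011 | 0.016 | 0.038 | 1.000 | 0.559 | 0.131 | 0.184 | 0.243 | 0.224 | 1.000 | 1.000 | 1.000
FY | 0.778 | 0.067 | 0.002 | 0.003 | 0.021 | 0.369 | 0.574 | 0.216 | 0.199 | 0.410 | 0.167 | 0.535 | 1.000 | 1.000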
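
One way to read the matrix is to list the pairs whose absolute coefficient reaches a cutoff. The sketch below uses a handful of coefficients copied from the matrix and a cutoff of 0.5. That cutoff is an assumption chosen for illustration, not a value stated in this section, and the names `correlations` and `THRESHOLD` are hypothetical.

```python
# A few coefficients copied from the matrix above (one entry per pair).
correlations = {
    ("price", "grand_total"): 0.730,
    ("price", "qty_ordered"): -0.163,
    ("status", "BI Status"): 0.816,
    ("Month", "M-Y"): 1.000,
    ("Month", "Year"): 0.486,
    ("item_id", "Customer ID"): 0.767,
}

THRESHOLD = 0.5  # assumed cutoff for calling a pair highly correlated

# Keep the pairs whose absolute coefficient reaches the cutoff.
high = sorted(pair for pair, r in correlations.items() if abs(r) >= THRESHOLD)

for left, right in high:
    print(left, "<->", right, correlations[(left, right)])
```

Under this cutoff Month pairs with M-Y but not with Year (0.486), and price pairs with grand_total but not with qty_ordered.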

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
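
Nullity correlation can be sketched as the Pearson correlation between per-column missing-value indicators (1 where a cell is missing, 0 otherwise). The rows below are hypothetical toy data, not values from the dataset. Only the column names come from this report, and the helper names are illustrative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Undefined (division by zero) if either sequence is constant, for
    example a column with no missing values at all.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical rows; None marks a missing cell.
rows = [
    {"Customer ID": 1.0, "Customer Since": "2016-07-01", "sales_commission_code": None},
    {"Customer ID": None, "Customer Since": None, "sales_commission_code": None},
    {"Customer ID": 3.0, "Customer Since": "2016-07-01", "sales_commission_code": "A"},
    {"Customer ID": None, "Customer Since": None, "sales_commission_code": "B"},
]

def nullity(column):
    """Indicator vector: 1 where the column is missing, else 0."""
    return [1 if row[column] is None else 0 for row in rows]

id_vs_since = pearson(nullity("Customer ID"), nullity("Customer Since"))
id_vs_code = pearson(nullity("Customer ID"), nullity("sales_commission_code"))
print(id_vs_since, id_vs_code)
```

In the toy rows the first two columns are always missing together, so their nullity correlation is 1, while the third column is missing independently of them and scores 0.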

Sample

(index) | item_id | status | created_at | sku | price | qty_ordered | grand_total | increment_id | category_name_1 | sales_commission_code | discount_amount | payment_method | Working Date | BI Status | MV | Year | Month | Customer Since | M-Y | FY | Customer ID
0 | 211131 | complete | 2016-07-01 | kreations_YI 06-L | 1950.0 | 1 | 1950.0 | 100147443 | Women's Fashion | \N | 0.0 | cod | 07-01-16 | #REF! | 1,950 | 2016 | 7 | 2016-07-01 | Jul-16 | FY17 | 1.0
1 | 211133 | canceled | 2016-07-01 | kcc_Buy 2 Frey Air Freshener & Get 1 Kasual Body Spray Free | 240.0 | 1 | 240.0 | 100147444 | Beauty & Grooming | \N | 0.0 | cod | 07-01-16 | Gross | 240 | 2016 | 7 | 2016-07-01 | Jul-16 | FY17 | 2.0
2 | 211134 | canceled | 2016-07-01 | Ego_UP0017-999-MR0 | 2450.0 | 1 | 2450.0 | 100147445 | Women's Fashion | \N | 0.0 | cod | 07-01-16 | Gross | 2,450 | 2016 | 7 | 2016-07-01 | Jul-16 | FY17 | 3.0
3 | 211135 | complete | 2016-07-01 | kcc_krone deal | 360.0 | 1 | 60.0 | 100147446 | Beauty & Grooming | R-FSD-52352 | 300.0 | cod | 07-01-16 | Net | 360 | 2016 | 7 | 2016-07-01 | Jul-16 | FY17 | 4.0
4 | 211136 | order_refunded | 2016-07-01 | BK7010400AG | 555.0 | 2 | 1110.0 | 100147447 | Soghaat | \N | 0.0 | cod | 07-01-16 | Valid | 1,110 | 2016 | 7 | 2016-07-01 | Jul-16 | FY17 | 5.0
5 | 211137 | canceled | 2016-07-01 | UK_Namkino All In One 200 Gms | 80.0 | 1 | 80.0 | 100147448 | Soghaat | \N | 0.0 | cod | 07-01-16 | Gross | 80 | 2016 | 7 | 2016-07-01 | Jul-16 | FY17 | 6.0
6 | 211138 | complete | 2016-07-01 | kcc_krone deal | 360.0 | 1 | 60.0 | 100147449 | Beauty & Grooming | \N | 300.0 | cod | 07-01-16 | Net | 360 | 2016 | 7 | 2016-07-01 | Jul-16 | FY17 | 7.0
7 | 211139 | complete | 2016-07-01 | UK_Namkino Mix Nimco 400 Gms | 170.0 | 1 | 170.0 | 100147450 | Soghaat | \N | 0.0 | cod | 07-01-16 | Net | 170 | 2016 | 7 | 2016-07-01 | Jul-16 | FY17 | 6.0
8 | 211140 | canceled | 2016-07-01 | Apple iPhone 6S 64GB | 96499.0 | 1 | 96499.0 | 100147451 | Mobiles & Tablets | \N | 0.0 | ublcreditcard | 07-01-16 | Gross | 96,499 | 2016 | 7 | 2016-07-01 | Jul-16 | FY17 | 8.0
9 | 211141 | canceled | 2016-07-01 | Apple iPhone 6S 64GB | 96499.0 | 1 | 96499.0 | 100147452 | Mobiles & Tablets | \N | 0.0 | mygateway | 07-01-16 | Gross | 96,499 | 2016 | 7 | 2016-07-01 | Jul-16 | FY17 | 8.0

(index) | item_id | status | created_at | sku | price | qty_ordered | grand_total | increment_id | category_name_1 | sales_commission_code | discount_amount | payment_method | Working Date | BI Status | MV | Year | Month | Customer Since | M-Y | FY | Customer ID
584514 | 905196 | paid | 2018-08-28 | MEFGUL5A9F882AA5B99-36 | 1299.0 | 1 | 0.0 | 100562381 | Men's Fashion | NaN | 0.0 | customercredit | 8/28/2018 | Valid | 1,299 | 2018 | 8 | 2018-06-01 | Aug-18 | FY19 | 111132.0
584515 | 905198 | paid | 2018-08-28 | MEFPAK5B360B03C6B72 | 999.0 | 1 | 0.0 | 100562381 | Men's Fashion | NaN | 0.0 | customercredit | 8/28/2018 | Valid | 999 | 2018 | 8 | 2018-06-01 | Aug-18 | FY19 | 111132.0
584516 | 905199 | pending | 2018-08-28 | MATINF59BAB39FDBEF1 | 6760.0 | 2 | 13770.0 | 100562382 | Mobiles & Tablets | NaN | 0.0 | jazzvoucher | 8/28/2018 | Gross | 13,520 | 2018 | 8 | 2016-09-01 | Aug-18 | FY19 | 8123.0
584517 | 905200 | cod | 2018-08-28 | WOFVAL59D5EA84167F9-M | 400.0 | 1 | 550.0 | 100562383 | Women's Fashion | NaN | 0.0 | cod | 8/28/2018 | Valid | 400 | 2018 | 8 | 2018-08-01 | Aug-18 | FY19 | 115325.0
584518 | 905202 | cod | 2018-08-28 | WOFNIG5B4D7EB0E9FDD-L | 499.0 | 1 | 649.0 | 100562384 | Women's Fashion | NaN | 0.0 | cod | 8/28/2018 | Valid | 499 | 2018 | 8 | 2018-08-01 | Aug-18 | FY19 | 115325.0
584519 | 905204 | cod | 2018-08-28 | WOFSCE5AE00357AECDE | 699.0 | 1 | 849.0 | 100562385 | Women's Fashion | NaN | 0.0 | cod | 8/28/2018 | Valid | 699 | 2018 | 8 | 2018-08-01 | Aug-18 | FY19 | 115320.0
584520 | 905205 | processing | 2018-08-28 | MATHUA5AF70A7D1E50A | 35599.0 | 1 | 35899.0 | 100562386 | Mobiles & Tablets | NaN | 0.0 | bankalfalah | 8/28/2018 | Gross | 35,599 | 2018 | 8 | 2018-08-01 | Aug-18 | FY19 | 115326.0
584521 | 905206 | processing | 2018-08-28 | MATSAM5B6D7208C6D30 | 129999.0 | 2 | 652178.0 | 100562387 | Mobiles & Tablets | NaN | 0.0 | bankalfalah | 8/28/2018 | Gross | 259,998 | 2018 | 8 | 2018-07-01 | Aug-18 | FY19 | 113474.0
584522 | 905207 | processing | 2018-08-28 | MATSAM5B1509B4696EA | 87300.0 | 2 | 652178.0 | 100562387 | Mobiles & Tablets | NaN | 0.0 | bankalfalah | 8/28/2018 | Gross | 174,600 | 2018 | 8 | 2018-07-01 | Aug-18 | FY19 | 113474.0
584523 | 905208 | processing | 2018-08-28 | MATSAM5B10F91A9B6AB | 108640.0 | 2 | 652178.0 | 100562387 | Mobiles & Tablets | NaN | 0.0 | bankalfalah | 8/28/2018 | Gross | 217,280 | 2018 | 8 | 2018-07-01 | Aug-18 | FY19 | 113474.0